Decision trees and random forests

Topics covered in the previous tutorials

  • KNN
  • Building a machine learning model using KNN

Decision trees and random forests:

A decision tree is a decision-making algorithm that builds a tree-like model of rules. On its own this model is not very efficient: it fits the training data well but is often much less effective on unseen data (it overfits).

A decision tree consists of three main parts:

  1. Root: the node where the decision process starts.
  2. Internal nodes: each one is the outcome of an earlier decision and also makes a further decision of its own.
  3. Leaves: terminal nodes that only give the final decision; they have no further branches.
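The three parts above can be sketched as nested if/else rules. This is a minimal hand-built sketch, not a learned model; the rules are chosen to agree with the heart-disease table later in this tutorial.

```python
# A minimal sketch of root, internal node, and leaf as nested
# if/else rules. The rules are illustrative, hand-picked to match
# the small heart-disease table in this tutorial.
def predict_heart_disease(chest_pain, blocked_arteries):
    # Root: the decision process starts here
    if chest_pain == "no":
        return "no"        # Leaf: final decision, no further branches
    # Internal node: outcome of the root decision that decides again
    if blocked_arteries == "yes":
        return "yes"       # Leaf
    return "no"            # Leaf
```

Each `return` is a leaf: it only produces an answer and asks no further questions, while the `if` statements are the root and internal nodes.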

Now let us build a sample decision tree for a small dataset.

Chest pain | Good blood circulation | Blocked arteries | Weight | Heart disease
no         | no                     | no               | 125    | no
yes        | yes                    | yes              | 180    | yes
yes        | yes                    | no               | 210    | no
yes        | no                     | yes              | 164    | yes

This dataset tells us whether a person has heart disease based on a few independent parameters. We would like to build a decision tree on it. There are four independent variables in the dataset; pick any one of them and make it the root.

From the given dataset we can build the model using these columns, and the tree above was built manually. This also shows that a single decision tree is not very efficient; the random forest, an extension of the decision tree, is a much more efficient machine learning model.
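Instead of building the tree by hand, the same table can be fitted automatically. This is a sketch assuming scikit-learn is installed; yes/no values are encoded as 1/0, and the query row is an invented example.

```python
# A sketch of fitting a decision tree on the small table above,
# assuming scikit-learn is available. yes/no is encoded as 1/0.
from sklearn.tree import DecisionTreeClassifier

# Columns: chest pain, good blood circulation, blocked arteries, weight
X = [[0, 0, 0, 125],
     [1, 1, 1, 180],
     [1, 1, 0, 210],
     [1, 0, 1, 164]]
y = ["no", "yes", "no", "yes"]

tree = DecisionTreeClassifier(random_state=0)
tree.fit(X, y)

# Predict for a new (made-up) person: chest pain, no good
# circulation, blocked arteries, weight 150
print(tree.predict([[1, 0, 1, 150]]))
```

With only four rows the tree memorizes the table perfectly, which is exactly the overfitting problem mentioned above.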

Random Forests:

A random forest is an extension of decision trees, or rather a combination of many of them, and it is more efficient than a single decision tree model. Consider the same table, which has data for four persons: for the first decision tree we draw 4 rows from the table with replacement (leaving some rows out and repeating others). In the same way, hundreds or thousands of decision trees are built on different samples, and this collection is known as a "random forest".
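The sampling described above is called bootstrapping. The sketch below, assuming scikit-learn is installed, first shows one bootstrap sample drawn with the standard library and then lets RandomForestClassifier automate the whole process across many trees.

```python
# A sketch of the bootstrap idea behind random forests: each tree
# trains on rows drawn WITH replacement, so some rows repeat and
# others are left out. Assumes scikit-learn is installed.
import random

from sklearn.ensemble import RandomForestClassifier

# The same encoded table as before (yes/no as 1/0)
X = [[0, 0, 0, 125],
     [1, 1, 1, 180],
     [1, 1, 0, 210],
     [1, 0, 1, 164]]
y = ["no", "yes", "no", "yes"]

random.seed(0)
# One bootstrap sample: 4 row indices drawn with replacement
idx = random.choices(range(len(X)), k=len(X))
print("bootstrap sample indices:", idx)

# RandomForestClassifier repeats this for every tree internally
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)
print(forest.predict([[1, 0, 1, 150]]))  # same made-up query as before
```

Each tree in the forest votes, and the majority vote is the forest's prediction, which is what makes the ensemble more robust than any single tree.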

In the upcoming tutorial, we will build a decision tree model and a random forest model and compare their accuracies.
